22 research outputs found

    Distribution and phylogeny of the bacterial translational GTPases and the Mqsr/YgiT regulatory system

    The electronic version of the dissertation does not include the publications. Proteins are vital for the cell: they serve as building blocks and as catalysts for many different reactions. Bioinformatics has equipped us with powerful sequence analysis tools. On the basis of sequence similarity, proteins can be grouped into families. A protein family is composed of homologous sequences, i.e. sequences that descend from a common ancestor. Proteins that belong to the same family often have the same or closely related functions. Our knowledge of the functional properties of proteins originates from experimental work performed on a limited number of model organisms. Scientists are often interested in how universal or specific a described protein function is. How and when does gene duplication followed by innovation give rise to a protein with a new function? How often have such events taken place on the evolutionary timescale? In my dissertation I have analyzed the gene and protein sequences of bacterial translational GTPases (trGTPases) and of the mqsR/ygiT toxin-antitoxin (TA) system. The common denominator of both is the protein synthesis machinery: both are connected to the ribosome and, through it, to the cell's ability to produce proteins as needed. The questions asked in this context fall broadly into two types: a) those related to the representation of a protein family, and b) those related to the evolution and functional innovation of a protein family. For bacterial trGTPases, nine different protein families can be distinguished, i.e. nine different sets of functions. Analyses of completely sequenced bacterial genomes showed that four of the nine families are conserved: IF2, EF-Tu, EFG, and LepA (EF4). Although RF3 is assigned a canonical role in translation termination in the classical model of protein synthesis, the RF3 gene was missing from approximately 40% of the analyzed bacterial genomes. By contrast, LepA, whose function is still not well understood, turned out to be a trGTPase specific to bacteria.
The analysis also revealed a wide distribution of EFG paralogs: many bacterial genomes contained two to three relatively diverged EFG genes. Closer analysis showed that the variation within the EFG family can be divided into four subfamilies: EFG I, spdEFG1, spdEFG2, and EFG II. The EFG I subfamily is experimentally well characterized. The spdEFGs have also been studied: spdEFG1 was found to act as a translocase in translation and spdEFG2 to take part in ribosome recycling, indicating a functional split between co-occurring paralogs. The widely distributed EFG II subfamily, however, has been little studied. Our phylogenetic analysis supports the hypothesis that the four EFG subfamilies are of ancient origin, i.e. that they arose before, or at the same time as, the divergence of the eukaryotic cell type from archaea and bacteria. Functional innovation in EFG II is carried primarily by 12 positions that are specifically conserved in the EFG II subfamily. Against the high overall divergence characteristic of EFG II, these positions stand out in the GTPase domain, in domain II, and in domain IV. Conserved changes in the GTPase domain, some of which lie in the GTP-binding G1 motif, point to altered conditions for GTP binding and hydrolysis. The increased charge at the tip of the domain IV loop, which in E. coli EFG inserts into the A-site, allows us to speculate about changed local conditions in the A-site during translocation. Conserved changes in domain II indicate changed interactions between EFG domains I, II, and III and the ribosome. Phylogenetic and sequence analysis of the EFG II subfamily clearly reveals phylum/class-specific sub-subgroups. These sub-subgroups differ from each other in the conservation pattern of the G2 motif and in the insertion/deletion pattern seen in the multiple sequence alignment. This second level characterizes EFG II as a phylum/class-specific factor. What role EFG II actually performs, and how and under which conditions it complements EFG I, remains to be answered. The present study provides a framework for future experiments.

    Explaining the Imperfection of the Molecular Clock of Hominid Mitochondria

    The molecular clock of mitochondrial DNA has been extensively used to date various genetic events. However, its substitution rate among humans appears to be higher than rates inferred from human-chimpanzee comparisons, limiting the potential of interspecies clock calibrations for intraspecific dating. It is not well understood how and why the substitution rate accelerates. We have analyzed a phylogenetic tree of 3057 publicly available human mitochondrial DNA coding region sequences for changes in the ratios of mutations belonging to different functional classes. The proportion of non-synonymous and RNA gene substitutions has decreased over hundreds of thousands of years. The highest mutation ratios, corresponding to fast acceleration in the apparent substitution rate of the coding sequence, have occurred after the end of the Last Ice Age. We recalibrate the molecular clock of human mtDNA as 7990 years per synonymous mutation over the mitochondrial genome. However, the distribution of substitutions at synonymous sites in human data significantly departs from a model assuming a single rate parameter and implies at least 3 different subclasses of sites. A neutral model with 3 synonymous substitution rates can explain most, if not all, of the apparent molecular clock difference between the intra- and interspecies levels. Our findings imply the sluggishness of purifying selection in removing the slightly deleterious mutations from the human as well as the Neandertal and chimpanzee populations. However, for humans, the weakness of purifying selection has been further exacerbated by the population expansions associated with the out-of-Africa migration and the end of the Last Ice Age.
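
    The recalibrated rate quoted in this abstract lends itself to a small worked calculation. The Python sketch below is only illustrative: the function name is hypothetical, and it assumes a single linear rate of 7990 years per synonymous mutation, which the abstract itself describes as a simplification, since synonymous sites appear to fall into at least 3 rate classes.

```python
# Rate quoted in the abstract: years per synonymous mutation
# over the mitochondrial genome.
YEARS_PER_SYNONYMOUS_MUTATION = 7990


def age_from_synonymous_mutations(n_mutations: float) -> float:
    """Return an age estimate in years from a synonymous mutation count.

    Assumption (not from the source): a strictly linear, single-rate clock.
    """
    if n_mutations < 0:
        raise ValueError("mutation count must be non-negative")
    return n_mutations * YEARS_PER_SYNONYMOUS_MUTATION
```

    Under this assumption, a lineage separated from its ancestor by 10 synonymous mutations would be dated to 79,900 years.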

    Creating the geographic expert system “Koolivõrk” (School Network)

    This project gives an overview of the first steps in creating an expert tool for the school network. The work proceeded in two directions. One focused on the expert system, which can be reconfigured for different thematic models and software solutions. The other focused on real data and their interpretation, so that the completed software solutions would immediately yield results usable for decisions about the school network. All of the work was carried out, and the result achieved, in close cooperation with the Ministry of Education and Research and the Praxis Centre for Policy Studies.

    A role for the Saccharomyces cerevisiae ABCF protein New1 in translation termination/recycling

    Translation is controlled by numerous accessory proteins and translation factors. In the yeast Saccharomyces cerevisiae, translation elongation requires an essential elongation factor, the ABCF ATPase eEF3. A closely related protein, New1, is encoded by a non-essential gene with cold sensitivity and ribosome assembly defect knock-out phenotypes. Since the exact molecular function of New1 is unknown, it is unclear if the ribosome assembly defect is direct, i.e. New1 is a bona fide assembly factor, or indirect, for instance due to a defect in protein synthesis. To investigate this, we employed yeast genetics, cryo-electron microscopy (cryo-EM) and ribosome profiling (Ribo-Seq) to interrogate the molecular function of New1. Overexpression of New1 rescues the inviability of a yeast strain lacking the otherwise strictly essential translation factor eEF3. The structure of the ATPase-deficient (EQ2) New1 mutant locked on the 80S ribosome reveals that New1 binds analogously to the ribosome as eEF3. Finally, Ribo-Seq analysis revealed that loss of New1 leads to ribosome queuing upstream of 3'-terminal lysine and arginine codons, including those genes encoding proteins of the cytoplasmic translational machinery. Our results suggest that New1 is a translation factor that fine-tunes the efficiency of translation termination or ribosome recycling

    A Computational Study of Elongation Factor G (EFG) Duplicated Genes: Diverged Nature Underlying the Innovation on the Same Structural Template

    BACKGROUND: Elongation factor G (EFG) is a core translational protein that catalyzes the elongation and recycling phases of translation. A more complex picture of EFG's evolution and function than previously accepted is emerging from analyses of heterogeneous EFG family members. Whereas gene duplication is postulated to be a prominent factor creating functional novelty, the striking divergence between EFG paralogs can be interpreted in terms of innovation in gene function. METHODOLOGY/PRINCIPAL FINDINGS: We present a computational study of the EFG protein family to cover the role of gene duplication in the evolution of protein function. Using phylogenetic methods, genome context conservation and insertion/deletion (indel) analysis we demonstrate that the EFG gene copies form four subfamilies: EFG I, spdEFG1, spdEFG2, and EFG II. These ancient gene families differ by their indispensability, degree of divergence and number of indels. We show the distribution of EFG subfamilies and describe evidence for lateral gene transfer and recent duplications. Extended studies of the EFG II subfamily concern its diverged nature. Remarkably, EFG II appears to be a widely distributed and highly diversified subfamily whose subdivisions correlate with phylum or class borders. The EFG II subfamily specific characteristics are low conservation of the GTPase domain and of domains II and III; absence of the trGTPase specific G2 consensus motif "RGITI"; and twelve conserved positions common to the whole subfamily. The EFG II specific functional changes could be related to changes in the properties of nucleotide binding and hydrolysis and strengthened ionic interactions between EFG II and the ribosome, particularly between parts of the decoding site and loop I of domain IV. CONCLUSIONS/SIGNIFICANCE: Our work, for the first time, comprehensively identifies and describes EFG subfamilies and improves our understanding of the function and evolution of EFG duplicated genes.

    A Meta-analysis of Gene Expression Signatures of Blood Pressure and Hypertension

    Genome-wide association studies (GWAS) have uncovered numerous genetic variants (SNPs) that are associated with blood pressure (BP). Genetic variants may lead to BP changes by acting on intermediate molecular phenotypes such as coded protein sequence or gene expression, which in turn affect BP variability. Therefore, characterizing genes whose expression is associated with BP may reveal cellular processes involved in BP regulation and uncover how transcripts mediate genetic and environmental effects on BP variability. A meta-analysis of results from six studies of global gene expression profiles of BP and hypertension in whole blood was performed in 7017 individuals who were not receiving antihypertensive drug treatment. We identified 34 genes that were differentially expressed in relation to BP (Bonferroni-corrected p<0.05). Among these genes, FOS and PTGS2 have been previously reported to be involved in BP-related processes; the others are novel. The top BP signature genes in aggregate explain 5%–9% of inter-individual variance in BP. Of note, rs3184504 in SH2B3, which was also reported in GWAS to be associated with BP, was found to be a trans regulator of the expression of 6 of the transcripts we found to be associated with BP (FOS, MYADM, PP1R15A, TAGAP, S100A10, and FGBP2). Gene set enrichment analysis suggested that the BP-related global gene expression changes include genes involved in inflammatory response and apoptosis pathways. Our study provides new insights into molecular mechanisms underlying BP regulation, and suggests novel transcriptomic markers for the treatment and prevention of hypertension
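
    The significance criterion quoted in this abstract (Bonferroni-corrected p < 0.05) can be illustrated with a minimal Python sketch. The function names and any p-values passed to them are hypothetical and not taken from the study; the sketch shows the standard Bonferroni adjustment, in which each raw p-value is multiplied by the number of tests and capped at 1.

```python
from typing import List


def bonferroni_adjust(p_values: List[float]) -> List[float]:
    """Multiply each raw p-value by the number of tests, capping at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


def is_significant(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag tests whose Bonferroni-adjusted p-value is below alpha."""
    return [p_adj < alpha for p_adj in bonferroni_adjust(p_values)]
```

    With three tests, a raw p-value must fall below 0.05 / 3 to pass, so the bar rises with the number of transcripts tested.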

    New genetic loci link adipose and insulin biology to body fat distribution.

    Body fat distribution is a heritable trait and a well-established predictor of adverse metabolic outcomes, independent of overall adiposity. To increase our understanding of the genetic basis of body fat distribution and its molecular links to cardiometabolic traits, here we conduct genome-wide association meta-analyses of traits related to waist and hip circumferences in up to 224,459 individuals. We identify 49 loci (33 new) associated with waist-to-hip ratio adjusted for body mass index (BMI), and an additional 19 loci newly associated with related waist and hip circumference measures (P < 5 × 10⁻⁸). In total, 20 of the 49 waist-to-hip ratio adjusted for BMI loci show significant sexual dimorphism, 19 of which display a stronger effect in women. The identified loci were enriched for genes expressed in adipose tissue and for putative regulatory elements in adipocytes. Pathway analyses implicated adipogenesis, angiogenesis, transcriptional regulation and insulin resistance as processes affecting fat distribution, providing insight into potential pathophysiological mechanisms.

    Modulation of Genetic Associations with Serum Urate Levels by Body-Mass-Index in Humans

    We tested for interactions between body mass index (BMI) and common genetic variants affecting serum urate levels, genome-wide, in up to 42569 participants. Both stratified genome-wide association (GWAS) analyses, in lean, overweight and obese individuals, and regression-type analyses in a non-BMI-stratified overall sample were performed. The former did not uncover any novel locus with a major main effect, but supported modulation of effects for some known and potentially new urate loci. The latter highlighted a SNP at RBFOX3 reaching genome-wide significant level (effect size 0.014, 95% CI 0.008-0.02, P-inter = 2.6 × 10⁻⁸). Two top loci in interaction term analyses, RBFOX3 and ERO1LB-EDAR-ADD, also displayed suggestive differences in main effect size between the lean and obese strata. All top-ranking loci for urate effect differences between BMI categories were novel, and most had effects of small magnitude but opposite direction between strata. They include the locus RBMS1-TANK (men, Pdiff lean-overweight = 4.7 × 10⁻⁸), a region that has been associated with several obesity-related traits, and TSPYL5 (men, Pdiff lean-overweight = 9.1 × 10⁻⁸), regulating adipocyte-produced estradiol. The top-ranking known urate locus was ABCG2, the strongest known gout risk locus, with an effect halved in obese compared to lean men (Pdiff lean-obese = 2 × 10⁻⁴). Finally, pathway analysis suggested a role for N-glycan biosynthesis as a prominent urate-associated pathway in the lean stratum. These results illustrate a potentially powerful way to monitor changes occurring in an obesogenic environment.

    Binding of tRNAs to prokaryotic polysomes in vivo


    Phylogenetic distribution of translational GTPases in bacteria

    Background: Translational GTPases are a family of proteins in which GTPase activity is stimulated by the large ribosomal subunit. Conserved sequence features allow members of this family to be identified. Results: To achieve accurate protein identification and grouping we have developed a method combining searches with Hidden Markov Model profiles and tree-based grouping. We found all the genes for translational GTPases in 191 fully sequenced bacterial genomes. The protein sequences were grouped into nine subfamilies. Analysis of the results shows that three translational GTPases, the translation factors EF-Tu, EF-G and IF2, are present in all organisms examined. In addition, several copies of the genes encoding EF-Tu and EF-G are present in some genomes. In the case of multiple genes for EF-Tu, the gene copies are nearly identical; in the case of multiple EF-G genes, the gene copies have diverged considerably. The fourth translational GTPase, LepA, the function of which is currently unknown, is also nearly universally conserved in bacteria, being absent from only one organism out of the 191 analyzed. The translation regulator, TypA, is also present in most of the organisms examined, being absent only from bacteria with small genomes. Surprisingly, some of the well-studied translational GTPases are present only in a very small number of bacteria. The translation termination factor RF3 is absent from many groups of bacteria with both small and large genomes. The specialized translation factor for selenocysteine incorporation, SelB, was found in only 39 organisms. Similarly, the tetracycline resistance proteins (Tet) are present only in a small number of species. Proteins of the CysN/NodQ subfamily have acquired functions in sulfur metabolism and production of signaling molecules. The genes coding for CysN/NodQ proteins were found in 74 genomes. This protein subfamily is not confined to Proteobacteria, as previously suggested, but is also present in many other groups of bacteria. Conclusion: Four of the translational GTPase subfamilies (IF2, EF-Tu, EF-G and LepA) are represented by at least one member in each bacterium studied, with one exception in LepA. This defines the set of translational GTPases essential for basic cell functions.